fix(daemon): run async kubo cleanup before process.exit on startup failure (issue #98)#99
Conversation
…ilure (issue #98) The daemon registers an asyncExitHook (exit-hook) to kill kubo and destroy the RPC server on shutdown. exit-hook only runs async hooks for SIGINT/SIGTERM/ beforeExit; an explicit process.exit() runs only synchronous hooks and prints "SYNCHRONOUS TERMINATION NOTICE". The notice's real trigger is the startup-failure path, not the force-quit guard at daemon.ts:655 (by the second signal exit-hook's internal isCalled is already true, so its synchronous handler returns before the warning). When the hook is registered (daemon.ts:600), kubo is spawned by keepKuboUp() (:688), and then createOrConnectRpc() (:689) throws — e.g. the #87/#97 TOCTOU RPC-port race — the throw unwinds to oclif's error handler (@oclif/core errors/handle.js), which calls process.exit(1). That skips the async cleanup: the notice prints, kubo is orphaned, and this daemon's state file is left behind. Fix: - Extract the async exit-hook body into a named shutdownDaemon() and capture both the hook's unsubscribe handle and the fn (hoisted above the try). - In the startup-failure catch, await shutdownDaemon() (kills kubo, destroys the RPC server, deletes the state file) and drop the hook before re-throwing, so oclif's process.exit() runs clean. No-ops if we failed before the hook was registered. - In the force-quit guard, drop the hook before its deliberate process.exit() as belt-and-suspenders; kubo is still SIGKILLed by the emergency "exit" handler. Test: - Add a guarded test hook PKC_CLI_TEST_FAIL_AFTER_KUBO_START that throws at the top of createOrConnectRpc, after kubo is up, mirroring the real port race. - New daemon.test.ts case "kills kubo (and prints no exit-hook termination notice) when startup fails after kubo is up": asserts non-zero exit, no SYNCHRONOUS TERMINATION NOTICE, and kubo stopped (not orphaned). Verified red before the fix, green after. Full daemon.test.ts (23), update-install-restart-race, and daemon-kubo-restart-race suites pass.
|
Warning Review limit reached
More reviews will be available in 32 minutes and 32 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThe PR fixes issue ChangesDaemon Async Shutdown & Kubo Orphan Prevention
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cli/commands/daemon.ts`:
- Around line 748-755: The startup-failure shutdown should be capped with
DAEMON_SHUTDOWN_TIMEOUT_MS just like the signal path: replace the direct await
of runDaemonShutdown() with a timed await (e.g., Promise.race between
runDaemonShutdown() and a timeout promise that resolves/rejects after
DAEMON_SHUTDOWN_TIMEOUT_MS) so shutdownDaemon()/daemonServer.destroy() cannot
hang forever; preserve the existing .catch(...) behavior to swallow shutdown
errors and ensure removeAsyncExitHook?.() still runs afterward.
In `@test/cli/daemon.test.ts`:
- Around line 884-895: Add an assertion that the injected failure reason is
present by checking the combined output for the exact failure message; after
constructing combinedOutput (from result.stdout and result.stderr) add
expect(combinedOutput).toContain("Simulated startup failure after kubo start")
so the test ensures the PKC_CLI_TEST_FAIL_AFTER_KUBO_START branch was hit when
calling runPkcDaemonExpectFailure.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a26718a1-77d0-4425-8b0c-b4bfddb2db18
📒 Files selected for processing (2)
src/cli/commands/daemon.tstest/cli/daemon.test.ts
…ted path (issue #98) Address CodeRabbit review on #99: - Cap the manual startup-failure shutdown with DAEMON_SHUTDOWN_TIMEOUT_MS (Promise.race with an unref'd timer), matching the timeout contract the signal path gets from exit-hook's `wait`, so a hung daemonServer.destroy() can't swallow the original startup error forever. Low live risk (daemonServer is ~never set on the catch path and killKuboProcess is already bounded), but it keeps both exit paths consistent. - Assert the test's combined output contains the injected failure message so the case stays pinned to the PKC_CLI_TEST_FAIL_AFTER_KUBO_START path instead of passing on any non-zero failure. Wrap the spawn in withKuboBindRetry so a lost kubo bind race (issue #87) retries with fresh ports rather than flaking the new message assertion. daemon.test.ts (23), update-install-restart-race, and daemon-kubo-restart-race all pass.
Closes #98
Problem
The daemon registers an
asyncExitHook(exit-hook) to kill kubo and destroy the RPC server on shutdown. exit-hook only runs async hooks for SIGINT/SIGTERM/beforeExit; an explicitprocess.exit()runs only synchronous hooks and prints theSYNCHRONOUS TERMINATION NOTICE.The notice's real trigger is not the force-quit guard at
daemon.ts:655(by the second signal exit-hook's internalisCalledis alreadytrue, so its synchronous handler returns before the warning). It's the startup-failure path:daemon.ts:600)keepKuboUp()spawns kubo (:688)createOrConnectRpc()(:689) throws — e.g. the Test flake: hardcoded kubo API ports in macOS ephemeral range cause intermittent 'address already in use' #87/Flaky daemon tests: bind-race retry predicate misses the daemon's 'PKC RPC port became occupied' fail-fast (lineage #87) #97 TOCTOU RPC-port race@oclif/coreerrors/handle.js), which callsprocess.exit(1)Fix (
src/cli/commands/daemon.ts)shutdownDaemon()and capture both the hook's unsubscribe handle and the fn (hoisted above thetry).catch:await shutdownDaemon()(kills kubo, destroys the RPC server, deletes the state file) and drop the hook before re-throwing, so oclif'sprocess.exit()runs clean. No-ops if we failed before the hook was registered.process.exit()(belt-and-suspenders); kubo is still SIGKILLed by the emergencyprocess.on("exit")handler.Test
PKC_CLI_TEST_FAIL_AFTER_KUBO_STARTthrows at the top ofcreateOrConnectRpc, after kubo is up — mirroring the real port race.daemon.test.tscase: "kills kubo (and prints no exit-hook termination notice) when startup fails after kubo is up" — asserts non-zero exit, noSYNCHRONOUS TERMINATION NOTICE, and kubo stopped (not orphaned).Verification
npm run build && npm run build:testpassdaemon.test.ts(23 tests),update-install-restart-race, anddaemon-kubo-restart-racesuites passSummary by CodeRabbit
Release Notes
Bug Fixes
Tests